1. Technical Field
The present disclosure relates to determining traffic quality with respect to online content.
2. Discussion of Technical Background
Online advertising plays an important role on the Internet. Generally, there are three types of players in the marketplace: publishers, advertisers, and commissioners. Commissioners, such as Google, Microsoft, and Yahoo!, provide a platform or exchange for publishers and advertisers. However, there are fraudulent players in the ecosystem. Publishers have strong incentives to inflate traffic in order to charge advertisers more. Some advertisers may also commit fraud to exhaust competitors' budgets. To protect legitimate publishers and advertisers, commissioners have to take responsibility for fighting fraudulent traffic; otherwise, the ecosystem will be damaged and legitimate players will leave. Many major commissioners currently have antifraud systems, which use rule-based or machine learning filters. These filters usually mark each impression and click with a binary flag, either valid or invalid. However, it is hard to draw a simple line between what is valid and what is invalid. In fact, there is suspicious traffic in a gray area that is neither good enough to be valid nor bad enough to be invalid.
Moreover, the data related to ad conversion (i.e., post ad-click user activity at an advertiser's website, etc.) may be sparse, and sometimes advertisers may not be willing to send the ad network their conversion data, which makes conversion data collection infeasible. Further, even if advertisers are willing to send the ad network their conversion data, ad conversion tracking may be misconfigured, so the collected conversion data itself may not be of good quality or reliable. Existing traffic quality scoring may only produce a traffic quality score at a coarse granularity (e.g., as a binary decision flagged as valid or invalid) to mitigate the sparsity of ad conversion data, and this may make it difficult to evaluate traffic quality for relatively small entities in an ad network that may only have relatively small traffic volume.